108 research outputs found

    Measuring the influence of concept detection on video retrieval

    There is an increasing emphasis on including semantic concept detection as part of video retrieval. This represents a modality for retrieval quite different from metadata-based and keyframe similarity-based approaches. One of the premises on which the success of this approach rests is that good-quality detection is available in order to guarantee retrieval quality. But how good does the feature detection actually need to be? Is it possible to achieve good retrieval quality even with poor-quality concept detection, and if so, what is the 'tipping point' below which detection accuracy proves not to be beneficial? In this paper we explore this question using a collection of rushes video in which we artificially vary the quality of detection of semantic features and study the impact on the resulting retrieval. Our results show that improving or degrading the performance of concept detectors is not directly reflected in retrieval performance, which raises interesting questions about how accurate concept detection really needs to be.
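One way such an experiment could artificially vary detector quality is by flipping a controlled fraction of binary concept-detection labels; the sketch below is a hypothetical illustration, not the paper's actual degradation procedure.

```python
import random

def degrade_detections(labels, error_rate, seed=0):
    """Flip a fraction of binary concept-detection labels to simulate
    a detector of lower accuracy (error_rate = 0.0 leaves the labels
    untouched; error_rate = 1.0 inverts every label)."""
    rng = random.Random(seed)
    return [(1 - l) if rng.random() < error_rate else l for l in labels]
```

Sweeping `error_rate` from 0 to 1 and re-running retrieval at each level would expose the kind of 'tipping point' the paper investigates.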

    Content vs. context for multimedia semantics: the case of SenseCam image structuring

    Much of the current work on determining multimedia semantics from multimedia artifacts is based on using either context or content. When leveraged thoroughly, each can independently provide the content descriptions used to build content-based applications. However, there are few cases where multimedia semantics are determined through an integrated analysis of content and context. In this keynote talk we present one such example system, in which we use an integrated combination of the two to automatically structure large collections of images taken by a SenseCam, a device from Microsoft Research which passively records a person's daily activities. This paper describes the post-processing we perform on SenseCam images in order to present a structured, organised visualisation of the highlights of each of the wearer's days.

    Synchronous collaborative information retrieval: techniques and evaluation

    Synchronous Collaborative Information Retrieval (SCIR) refers to systems that support multiple users searching together at the same time in order to satisfy a shared information need. To date most SCIR systems have focussed on providing various awareness tools to enable collaborating users to coordinate the search task. However, requiring users to both search and coordinate the group activity may prove too demanding; on the other hand, without effective coordination policies the group search may not be effective. In this paper we propose and evaluate novel system-mediated techniques for coordinating a group search. These techniques allow for an effective division of labour across the group whereby each group member can explore a subset of the search space. We also propose and evaluate techniques to support automated sharing of knowledge across searchers in SCIR, through novel collaborative and complementary relevance feedback techniques. In order to evaluate these techniques, we propose a framework for SCIR evaluation based on simulations. To populate these simulations we extract data from TREC interactive search logs. This work represents the first simulations of SCIR to date and the first such use of this TREC data.
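A system-mediated division of labour of the kind described above could, in its simplest form, partition a ranked result list among the collaborating searchers; the round-robin policy below is an illustrative assumption, not the paper's actual coordination technique.

```python
def divide_ranked_list(ranked_docs, n_searchers):
    """Round-robin split of a ranked result list so that each
    collaborating searcher explores a disjoint subset of the
    search space, with top-ranked documents spread evenly."""
    subsets = [[] for _ in range(n_searchers)]
    for rank, doc in enumerate(ranked_docs):
        subsets[rank % n_searchers].append(doc)
    return subsets

# Example: four results shared between two searchers.
per_user = divide_ranked_list(["d1", "d2", "d3", "d4"], 2)
# per_user[0] == ["d1", "d3"]; per_user[1] == ["d2", "d4"]
```

Because the subsets are disjoint, no two searchers duplicate effort, which is the point of system-mediated coordination.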

    Relevance feedback and query expansion for searching the web: a model for searching a digital library

    A fully operational large-scale digital library is likely to be based on a distributed architecture, and because of this it is likely that a number of independent search engines will be used to index different overlapping portions of the entire contents of the library. In any case, different media (text, audio, image, etc.) will be indexed for retrieval by different search engines, so techniques which provide a coherent and unified search over a suite of underlying independent search engines are likely to be an important part of navigating in a digital library. In this paper we present an architecture and a system for searching the world's largest DL, the World Wide Web. What makes our system novel is that we use a suite of underlying web search engines to do the bulk of the work while our system orchestrates them in a parallel fashion to provide a higher level of information retrieval functionality. Thus it is our meta search engine, and not the underlying direct search engines, that provides the relevance feedback and query expansion options for the user. The paper presents the design and architecture of the system which has been implemented, describes an initial version which has been operational for almost a year, and outlines the operation of the advanced version.
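The two operations a meta search engine layers on top of its underlying engines can be sketched as follows; the rank-fusion formula and the frequency-based expansion are illustrative assumptions, not the system's actual algorithms.

```python
from collections import Counter, defaultdict

def merge_engine_results(result_lists, k=60):
    """Fuse the ranked lists returned by independent search engines
    using reciprocal-rank scoring, so the meta-engine presents one
    unified ranking to the user."""
    scores = defaultdict(float)
    for ranking in result_lists:
        for rank, url in enumerate(ranking):
            scores[url] += 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

def expand_query(query_terms, relevant_docs, n_terms=2):
    """Naive query expansion via relevance feedback: append the most
    frequent new terms from documents the user marked relevant."""
    counts = Counter(t for doc in relevant_docs for t in doc.split()
                     if t not in query_terms)
    return query_terms + [t for t, _ in counts.most_common(n_terms)]
```

Documents returned by several engines accumulate score from each list, so agreement between engines pushes a result up the fused ranking.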

    Content-Based Video Description for Automatic Video Genre Categorization

    In this paper, we propose an audio-visual approach to video genre categorization. Audio information is extracted at block level, which has the advantage of capturing local temporal information. At the temporal structural level, we assess action content with respect to human perception. Further, color perception is quantified with statistics of color distribution, elementary hues, color properties and color relationships. The last category of descriptors determines statistics of contour geometry. An extensive evaluation of this multi-modal approach based on more than 91 hours of video footage is presented. We obtain average precision and recall ratios within [87% − 100%] and [77% − 100%], respectively, while average correct classification is up to 97%. Additionally, movies displayed according to feature-based coordinates in a virtual 3D browsing environment tend to regroup with respect to genre, which has potential applications in real content-based browsing systems.
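Block-level extraction of the kind described above can be illustrated by cutting a 1-D signal into fixed-size blocks and summarising each block locally; the statistics below are a minimal sketch, not the paper's actual descriptors.

```python
import numpy as np

def block_level_features(signal, block_size):
    """Cut a 1-D signal into fixed-size, non-overlapping blocks and
    describe each block with local statistics, preserving the temporal
    information that a single global descriptor would lose."""
    n_blocks = len(signal) // block_size
    blocks = np.reshape(signal[:n_blocks * block_size],
                        (n_blocks, block_size))
    # One (mean, std) pair per block -> array of shape (n_blocks, 2)
    return np.stack([blocks.mean(axis=1), blocks.std(axis=1)], axis=1)
```

The resulting sequence of per-block vectors can then be pooled or fed to a classifier, so genre cues that occur only briefly are not averaged away.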

    Overview of VideoCLEF 2009: New perspectives on speech-based multimedia content enrichment

    VideoCLEF 2009 offered three tasks related to enriching video content for improved multimedia access in a multilingual environment. For each task, video data (Dutch-language television, predominantly documentaries) accompanied by speech recognition transcripts were provided. The Subject Classification Task involved automatic tagging of videos with subject theme labels. The best performance was achieved by approaching subject tagging as an information retrieval task and using both speech recognition transcripts and archival metadata. Alternatively, classifiers were trained using either the training data provided or data collected from Wikipedia or via general Web search. The Affect Task involved detecting narrative peaks, defined as points where viewers perceive heightened dramatic tension. The task was carried out on the “Beeldenstorm” collection containing 45 short-form documentaries on the visual arts. The best runs exploited affective vocabulary and audience-directed speech. Other approaches included using topic changes, elevated speaking pitch, increased speaking intensity and radical visual changes. The Linking Task, also called “Finding Related Resources Across Languages,” involved linking video to material on the same subject in a different language. Participants were provided with a list of multimedia anchors (short video segments) in the Dutch-language “Beeldenstorm” collection and were expected to return target pages drawn from English-language Wikipedia. The best-performing methods used the transcript of the speech spoken during the multimedia anchor to build a query to search an index of the Dutch-language Wikipedia. The Dutch Wikipedia pages returned were used to identify related English pages. Participants also experimented with pseudo-relevance feedback, query translation and methods that targeted proper names.

    Are Visual Informatics Actually Useful in Practice: A Study in a Film Studies Context

    This paper describes our work examining the question of whether providing a visual informatics application in an educational scenario, in particular video content analysis, actually yields real benefit in practice. We provide a new software tool in the domain of movie content analysis technologies for use by film studies students at Dublin City University, and we address the research question of measuring the ‘benefit’ these technologies bring to students. We examine their real practices in studying for the module using our advanced application as compared to conventional DVD browsing of movie content. In carrying out this experiment, we found that with the new technologies students produce better essay outcomes, report higher satisfaction levels, and spend more time, on average, analysing movies.

    Predicting livestock behaviour using accelerometers: A systematic review of processing techniques for ruminant behaviour prediction from raw accelerometer data

    Precision Technologies are emerging in the context of livestock farming to improve management practices and the health and welfare of livestock through monitoring individual animal behaviour. Continuously collecting information about livestock behaviour is a promising way to address several of these target areas. Wearable accelerometer sensors are currently the most promising system for capturing livestock behaviour. Accelerometer data must be analysed properly to obtain reliable information on livestock behaviour. Many studies are emerging on this subject, but none to date has highlighted which techniques to recommend or avoid. In this paper, we systematically review the literature on the prediction of livestock behaviour from raw accelerometer data, with a specific focus on ruminants. Our review is based on 66 surveyed articles, providing reliable evidence of a 3-step methodology common to all studies, namely (1) Data Collection, (2) Data Pre-Processing and (3) Model Development, with different techniques used at each of the 3 steps. The aim of this review is thus to (i) summarise the predictive performance of models and point out the main limitations of the 3-step methodology, (ii) make recommendations on a methodological blueprint for future studies and (iii) propose lines to explore in order to address the limitations outlined. This review shows that the 3-step methodology ensures that several major ruminant behaviours can be reliably predicted, such as grazing/eating, ruminating, moving, lying or standing. However, the area faces two main limitations: (i) most models are less accurate on rarely observed or transitional behaviours, which may be important for assessing health, welfare and environmental issues, and (ii) many models exhibit poor generalisation, which can compromise their commercial use.
To overcome these limitations we recommend maximising variability in the data collected, selecting pre-processing methods appropriate to the target behaviours being studied, and using classifiers that avoid over-fitting to improve generalisability. This review presents the current situation involving the use of sensors as valuable tools in the field of behaviour recording and contributes to the improvement of existing tools for automatically monitoring ruminant behaviour in order to address some of the issues faced by livestock farming.
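The pre-processing and model-development steps of the 3-step methodology can be sketched end to end; the window length, summary features and nearest-centroid classifier below are deliberately simple illustrative choices standing in for the many techniques the review surveys.

```python
import numpy as np

def windows(acc, size):
    """Step 2 (pre-processing): segment raw accelerometer samples
    into fixed-length, non-overlapping windows."""
    n = len(acc) // size
    return np.reshape(acc[:n * size], (n, size))

def features(win):
    """Per-window summary statistics used as model inputs."""
    return np.stack([win.mean(axis=1), win.std(axis=1)], axis=1)

class NearestCentroid:
    """Step 3 (model development): each behaviour class is represented
    by the mean of its training feature vectors; a window is assigned
    the class of the nearest centroid."""
    def fit(self, X, y):
        self.labels = sorted(set(y))
        self.centroids = np.array(
            [X[np.array(y) == lab].mean(axis=0) for lab in self.labels])
        return self
    def predict(self, X):
        d = np.linalg.norm(X[:, None, :] - self.centroids[None], axis=2)
        return [self.labels[i] for i in d.argmin(axis=1)]
```

In practice a still behaviour such as lying produces low-variance windows while an active behaviour such as grazing produces high-variance ones, which is exactly the separation these features capture.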

    Designing novel applications for emerging multimedia technology

    Current R&D in media technologies such as Multimedia, Semantic Web and Sensor Web technologies is advancing at a fierce rate, and these technologies are sure to become regular items in a 'conventional' technology inventory in the near future. While their R&D nature means their accuracy, reliability and robustness are not yet sufficient for real-world use, we want to envision now the near future in which these technologies will have matured and be used in real applications, in order to explore and start shaping the many possible new ways they could be utilised. In this talk, some of this effort in designing novel applications that incorporate various media technologies as their backend will be presented. Examples include novel scenarios for a LifeLogging application that incorporates automatic structuring of millions of photos passively captured by a SenseCam (a wearable digital camera that automatically takes photos triggered by environmental sensors), and an interactive TV application incorporating a number of multimedia tools that is nevertheless extremely simple and easy to use with a remote control in a lean-back position. The talk will conclude with remarks on how the design of novel applications that have no precedent or existing user base requires a somewhat different approach from those suggested and practiced in conventional usability engineering methodology.